14 research outputs found

    The Capacity of Some Pólya String Models

    We study random string-duplication systems, which we call Pólya string models. These are motivated by DNA storage in living organisms and by certain random mutation processes that affect their genomes. Unlike previous works, which study the combinatorial capacity of string-duplication systems or various string statistics, this work provides the exact capacity, or bounds on it, for several probabilistic models. In particular, we study the capacity of noisy string-duplication systems, including the tandem-duplication, end-duplication, and interspersed-duplication systems. Interesting connections are drawn between some systems and the signature of random permutations, as well as to the beta distribution common in population genetics.
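As a rough illustration of what a single duplication step does to a string, the following Python sketch applies one tandem duplication or one end duplication. The abstract does not specify the probabilistic model, so the choices here (a fixed duplication length `k`, a uniformly random starting position, and the function names) are assumptions made only for illustration, not the authors' definitions.

```python
import random


def tandem_duplicate(s, k, rng):
    """Copy a length-k substring and insert the copy right after the original.

    Assumption: the start position is chosen uniformly at random and
    k <= len(s). The abstract does not state the actual distribution.
    """
    i = rng.randrange(len(s) - k + 1)
    return s[:i + k] + s[i:i + k] + s[i + k:]


def end_duplicate(s, k, rng):
    """Copy a length-k substring and append the copy at the end of the string.

    Same illustrative assumptions as tandem_duplicate.
    """
    i = rng.randrange(len(s) - k + 1)
    return s + s[i:i + k]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same strings
    s = "ACGT"
    for _ in range(3):
        s = tandem_duplicate(s, 2, rng)
    print(s)
```

Each step lengthens the string by exactly `k` symbols, so after `m` steps a string of initial length `n` has length `n + m*k`.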

    Bounds and Constructions for Generalized Batch Codes

    Private information retrieval (PIR) codes and batch codes are two important types of codes that are designed for coded distributed storage systems and private information retrieval protocols. These codes have been the focus of much attention in recent years, as they enable efficient and secure storage and retrieval of data in distributed systems. In this paper, we introduce a new class of codes called $(s,t)$-batch codes. These codes are a type of storage code that can handle any multi-set of $t$ requests, comprised of $s$ distinct information symbols. Importantly, PIR codes and batch codes are special cases of $(s,t)$-batch codes. The main goal of this paper is to explore the relationship between the number of redundancy symbols and the $(s,t)$-batch code property. Specifically, we establish a lower bound on the number of redundancy symbols required and present several constructions of $(s,t)$-batch codes. Furthermore, we extend this property to the case where each request is a linear combination of information symbols, which we refer to as functional $(s,t)$-batch codes. Specifically, we demonstrate that simplex codes are asymptotically optimal functional $(s,t)$-batch codes, in terms of the number of redundancy symbols required, under certain parameter regimes. Comment: 25 pages

    The capacity of some Pólya string models

    We study random string-duplication systems, called Pólya string models, motivated by certain random mutation processes in the genome of living organisms. Unlike previous works, which study the combinatorial capacity of string-duplication systems or peripheral properties such as symbol frequency, this work provides the exact capacity, or bounds on it, for several probabilistic models. In particular, we give the exact capacity of the random tandem-duplication system and of the end-duplication system, and bound the capacity of the complement tandem-duplication system. Interesting connections are drawn between the former and the beta distribution common to population genetics, as well as between the latter system and signatures of random permutations.

    Repeat-Free Codes

    In this paper we consider the problem of encoding data into repeat-free sequences, in which sequences are required to contain any $k$-tuple at most once (for a predefined $k$). First, the capacity and redundancy of the repeat-free constraint are calculated. Then, an efficient algorithm, which uses a single bit of redundancy, is presented to encode length-$n$ sequences for $k=2+2\log(n)$. This algorithm is then improved to support any value of $k$ of the form $k=a\log(n)$, for $1<a$, while its redundancy is $o(n)$. We also calculate the capacity of repeat-free sequences when combined with local constraints which are given by a constrained system, and the capacity of multi-dimensional repeat-free codes. Comment: 18 pages
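The repeat-free constraint described in the abstract above (every k-tuple appears at most once) can be checked directly. The sketch below is only a verifier for that constraint, not the encoding algorithm of the paper; the function name `is_repeat_free` and the use of Python strings as sequences are illustrative assumptions.

```python
def is_repeat_free(seq, k):
    """Return True if every length-k window of seq occurs at most once."""
    seen = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in seen:
            # The same k-tuple appears at two different positions.
            return False
        seen.add(window)
    return True


if __name__ == "__main__":
    print(is_repeat_free("0011", 2))  # windows: 00, 01, 11
    print(is_repeat_free("0101", 2))  # windows: 01, 10, 01
```

The check slides a window over the sequence and stores each k-tuple in a set, so it makes a single pass over the sequence. Sequences shorter than `k` have no windows and are reported as repeat-free.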

    Binary $t_1$-Deletion-$t_2$-Insertion-Burst Correcting Codes and Codes Correcting a Burst of Deletions

    We first give a construction of binary $t_1$-deletion-$t_2$-insertion-burst correcting codes with redundancy at most $\log(n)+(t_1-t_2-1)\log\log(n)+O(1)$, where $t_1\ge 2t_2$. Then we give an improved construction of binary codes capable of correcting a burst of $4$ non-consecutive deletions, whose redundancy is reduced from $7\log(n)+2\log\log(n)+O(1)$ to $4\log(n)+6\log\log(n)+O(1)$. Lastly, by connecting non-binary $b$-burst-deletion correcting codes with binary $2b$-deletion-$b$-insertion-burst correcting codes, we give a new construction of non-binary $b$-burst-deletion correcting codes with redundancy at most $\log(n)+(b-1)\log\log(n)+O(1)$. This construction is different from previous results. Comment: Results are covered by others' work
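To get a feel for the redundancy reduction quoted in the abstract above, the two expressions can be evaluated numerically. This is only a back-of-the-envelope sketch: it drops the O(1) terms, which the abstract leaves unspecified, and it assumes base-2 logarithms. The function names are hypothetical.

```python
import math


def redundancy_previous(n):
    """7 log(n) + 2 loglog(n), with the O(1) term dropped (base 2 assumed)."""
    return 7 * math.log2(n) + 2 * math.log2(math.log2(n))


def redundancy_improved(n):
    """4 log(n) + 6 loglog(n), with the O(1) term dropped (base 2 assumed)."""
    return 4 * math.log2(n) + 6 * math.log2(math.log2(n))


if __name__ == "__main__":
    for exponent in (10, 16, 20):
        n = 2 ** exponent
        print(n, redundancy_previous(n), redundancy_improved(n))
```

For example, at n = 2^16 the leading terms give 7·16 + 2·4 = 120 bits against 4·16 + 6·4 = 88 bits, before the constant terms are taken into account.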

    Throughput and Delay Analysis for Coded ARQ

    © 2019 IFIP. We propose a Coded selective-repeat ARQ protocol with cumulative feedback, building on the uncoded baseline scheme for ARQ developed by Ausavapattanakun and Nosratinia. Our method leverages discrete-time queuing and coding theory to analyze the performance of the proposed data transmission method. We incorporate forward error-correction (FEC) to reduce in-order delivery delay, and exploit a matrix signal-flow graph approach to analyze the throughput and delay. We demonstrate and contrast the performance of the Coded ARQ protocol with that of the uncoded ARQ scheme, with minimum coding, i.e., with a sliding window of size 2. Coded ARQ can provide gains of up to about 40% in terms of throughput. It also provides delay guarantees, and is robust to various challenges such as imperfect and delayed feedback, burst erasures, and round-trip time fluctuations. United States. Defense Advanced Research Projects Agency (Prime Award HR0011-17-C-0050); Intel Corporation